ABSTRACT


This work focuses on the modeling and analysis of crop yield over space and time, using an onion yield data set. The study estimates the future yield 
of onions in major producing states in India. To achieve this, we applied time series analysis to onion yield data recorded from 1978 to 2020, as available from the website 
of the Ministry of Agriculture, Government of India. The data were analysed in SPSS using autoregressive integrated moving average (ARIMA) models, 
and the best-fitting model was used to estimate onion yield. Selecting a suitable model requires comparing how well different models 
predict future outcomes and choosing the most suitable one for the prediction work. The objectives were to develop a suitable ARIMA forecast model for agricultural data and 
to study the predictive ability of the univariate ARIMA model in order to suggest an optimal model. 
INTRODUCTION: 

The farming sector faces several challenges, from land preparation to harvesting 
and marketing of farm produce. The consumers of farm output, though healthy 
and wealthy, are able to bargain the price of the produce down to the lowest level. 
The traders collude, and their associations are strong enough to bargain with 
the farmers. The farmers and farmer organizations, in contrast, are weakly 
associated and have not been able to unite into an organization that can 
command a price for their produce. Though there are a few farmer organizations 
in our country, they have not risen to the expected level in marketing or distributing 
farm produce. At the time of harvest, supply is in excess and demand is low, 
and hence proper storage and distribution are a must. State 
Governments make efforts to procure the principal crop outputs, particularly 
cereals like rice, but cannot procure other farm produce in time; 
balancing the area under a crop against its output therefore becomes all the 
more important. For that, information on prices, the demand for the produce 
and the price the farmers expect for their output has to be provided in 
advance of the crop season. In this respect, forecasting is the tool that 
helps to predict yield and price in advance. Forecasting refers to the practice of 
predicting what will happen in the future by taking into consideration events in 
the past and present. Basically, it is a decision-making tool that helps businesses 
cope with the impact of the future's uncertainty by examining historical data and 
trends. It is a planning tool that enables businesses to chart their next moves and 
create budgets that will hopefully cover whatever uncertainties may occur (CFI, 
2022). This study aims at forecasting the yield of onion in India. 


METHODOLOGY: 

This study aimed at forecasting the yield of onion in India. To that end, the basis 
of forecasting is discussed first to develop an overall idea. The first step in the 
process is developing the basis of the investigation and identifying where the 
business is currently positioned in the market. 


Forecasting Methods: 
Businesses choose between two basic methods when they want to predict what 
can possibly happen in the future, namely, qualitative and quantitative methods. 


1. Qualitative Method: Otherwise known as the judgmental method, quali- 
tative forecasting offers subjective results, as it is comprised of personal 
judgments by experts or forecasters. Forecasts are often biased because 
they are based on the expert's knowledge, intuition, and experience, and 
rarely on data, making the process non-mathematical. 


One example is when a person forecasts the outcome of a finals game in the 
NBA, which, of course, is based more on personal motivation and interest. 
The weakness of such a method is that it can be inaccurate. 


2. Quantitative Method: The quantitative method of forecasting is a mathe- 
matical process, making it consistent and objective. It steers away from 
basing the results on opinion and intuition, instead utilizing large amounts 
of data and figures that are interpreted. 


Features of Forecasting: 
Here are some of the features of making a forecast: 
• Forecasts are created to predict the future, making them important for 
planning. 

• Forecasts are based on opinions, intuition, guesses, as well as on facts, 
figures, and other relevant data. All of the factors that go into creating a 
forecast reflect to some extent what happened with the business in the past and 
what is considered likely to occur in the future. 

• Most businesses use the quantitative method, particularly in planning and 
budgeting activities. 


Collection of Data: 

The secondary data were collected from the website of the Ministry of Agriculture, 
Government of India. Onion production data over a period of time were gathered 
from this website and analysed using ARIMA models. The data were analysed 
in SPSS to find the best-fitting autoregressive integrated moving average 
(ARIMA) model, and the selected best model was used to 
forecast the onion yield. 


ARIMA Modeling: 

In general, an ARIMA model is characterized by the notation ARIMA (p,d,q), 
where p, d and q denote the orders of autoregression, integration (differencing) and 
moving average respectively. The time series is modelled as a linear function of past 
actual values and random shocks. For instance, given a time series process {Y_t}, a 
first-order autoregressive process is denoted by ARIMA (1,0,0) or simply AR(1) and is 
given by 


$Y_t = \mu + \phi_1 Y_{t-1} + \varepsilon_t$


Stationary: 

A stationary time series is one whose statistical properties, such as its mean and 
variance, do not depend on the time at which the series is observed. White noise, 
in which the value at each point in time does not depend on the value at any other 
point, is the simplest example. The plots below are some examples of stationary time 
series. Other examples may include cyclic data with non-consistent periods. 


Differencing: 

Differencing is a time series transformation that attempts to eliminate 
time-dependent factors from the time series such as trend and seasonality. There are 
different orders of differencing; the equation below shows the first-order 
difference. It is simply the difference between the current and previous observation: 

$y'_t = y_t - y_{t-1}$

After the first-order difference, if the time series is still not stationary, 
differencing once more will give you the second-order difference: 

$y''_t = y'_t - y'_{t-1}$

$= (y_t - y_{t-1}) - (y_{t-1} - y_{t-2})$

$= y_t - 2y_{t-1} + y_{t-2}$

The order of the differencing can be defined in the d parameter of the model. 
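
The differencing step can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the study; the function name `difference` is our own, and the input is the first five yields from Table 1. It also checks that differencing twice agrees with the direct second-order formula.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times."""
    values = list(series)
    for _ in range(order):
        # Each pass replaces y_t with y_t - y_{t-1}; the series gets one value shorter.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# First five yields (kg/ha) from Table 1, 1978-1982.
yields = [10403, 10232, 9961, 10562, 10330]

first_diff = difference(yields, order=1)    # [-171, -271, 601, -232]
second_diff = difference(yields, order=2)   # [-100, 872, -833]

# Second-order difference computed directly as y_t - 2*y_{t-1} + y_{t-2}.
direct = [yields[t] - 2 * yields[t - 1] + yields[t - 2]
          for t in range(2, len(yields))]

print(first_diff)
print(second_diff)
print(direct == second_diff)
```

Each pass of differencing shortens the series by one observation, which is why a d = 1 model on 42 yearly values works with 41 differences.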


Autoregressive Models: 
An autoregressive (AR) model, that is, a regression of the series on itself, is simply a 
multiple linear regression having the previous time steps as predictors of the 
current value. It usually performs better on stationary time series data. 


$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t$


The order of the AR model, p defines the number of previous time steps that are 
accounted for in the current observation. For example, AR(1) accounts for time 
step t-1 for the current observation, AR(2) accounts for time steps t-1 and t-2. 
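
As a rough illustration of the regression view of an AR model, the sketch below estimates an AR(1) model by ordinary least squares. Both the estimation method and the function name `fit_ar1` are assumptions made for this example; SPSS uses its own estimation routine. The toy series is generated without noise, so the fit should recover the generating values.

```python
def fit_ar1(series):
    """Estimate c and phi_1 in y_t = c + phi_1 * y_{t-1} + e_t by least squares."""
    x = series[:-1]   # lagged values y_{t-1}
    y = series[1:]    # current values y_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


# Noise-free toy series generated from y_t = 2 + 0.5 * y_{t-1}.
toy = [10.0]
for _ in range(10):
    toy.append(2.0 + 0.5 * toy[-1])

c, phi = fit_ar1(toy)
one_step = c + phi * toy[-1]   # forecast for the next time step

print(round(c, 6), round(phi, 6), round(one_step, 6))
```

An AR(2) model would add y_{t-2} as a second predictor, which turns the simple regression into a multiple regression.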


Moving Average Models: 
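Unlike AR models, Moving Average (MA) models predict the next step based on 
the errors of the previous steps. 


$y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$

Combined with the AR and differencing terms, the model can be written in backshift notation, where $B y_t = y_{t-1}$, as 

$(1 - \phi_1 B - \dots - \phi_p B^p)(1 - B)^d y_t = c + (1 + \theta_1 B + \dots + \theta_q B^q)\varepsilon_t$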


The order of the MA model, q, plays the same role as p does for AR. A larger q means 
that the errors of more previous time steps are taken into account. 


ARIMA:
By combining autoregression, differencing and moving average, we get an ARIMA 
(p, d, q) model, where p, d and q are the respective parameters for AR, differencing and MA. 


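$y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$


Here, the right-hand side is a constant c, plus a linear combination of the previous p 
values of the series, plus a linear combination of the previous q error terms, plus 
the error at the current time ($\varepsilon_t$). $y'_t$ is the differenced 
series. It needs to be integrated (the opposite of differencing) to get the actual series. 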
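
The integration step can be sketched as a running sum that starts from the last known level of the series. The function name `integrate` is our own, and the example covers only first-order differencing (d = 1), using the first five yields from Table 1.

```python
def integrate(differenced, initial_value):
    """Undo first-order differencing: y_t = y_{t-1} + y'_t."""
    restored = [initial_value]
    for change in differenced:
        restored.append(restored[-1] + change)
    return restored


# First five yields (kg/ha) from Table 1, 1978-1982.
yields = [10403, 10232, 9961, 10562, 10330]
diffs = [curr - prev for prev, curr in zip(yields, yields[1:])]

restored = integrate(diffs, yields[0])
print(restored == yields)
```

When forecasting, the same running sum is applied to the predicted differences, starting from the last observed yield.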
38. 2015 15857 01.32 
40. 2017 18103 01.28 
Figure: 1.1: The Magnitude of Yield of Onion in different Decades of Time 


Modeling and Forecasting: 
• To perform an actual forecast and assess the quality of the forecasts. 


RESULTS AND DISCUSSION: 


Using the above models, the data outlined in Table 1 were analysed and the results 
are presented under the subsequent headings. 


Table 1: Area and Productivity of Onion in India 


Identification: 
• The graphs of ACF and PACF are drawn for all the observed variables, to 
identify the models to be fitted. 
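
As a minimal sketch of what an ACF graph plots, the sample autocorrelation at each lag can be computed as below. The function name `sample_acf` is our own, and the input is only the first ten yields from Table 1, so the output is an illustration and does not reproduce the SPSS values reported in Table 2.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    deviations = [value - mean for value in series]
    denominator = sum(d * d for d in deviations)
    acf = []
    for lag in range(1, max_lag + 1):
        # Sum of products of deviations that are `lag` steps apart.
        numerator = sum(deviations[t] * deviations[t - lag] for t in range(lag, n))
        acf.append(numerator / denominator)
    return acf


# First ten yields (kg/ha) from Table 1, 1978-1987.
yields = [10403, 10232, 9961, 10562, 10330, 9982, 11139, 10202, 9659, 9857]

for lag, r in enumerate(sample_acf(yields, max_lag=3), start=1):
    print(lag, round(r, 3))
```

A slowly decaying ACF, as in a trending series, is the usual sign that differencing is needed before fitting the model.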


Sequence Plot: 

The sequence plot above indicates that the yield of onion per hectare increased 
only marginally up to the year 2000 and rose steadily thereafter, 
suggesting that improved onion production technology 
contributed to the higher yield over time. The release of new onion hybrids and 
the associated technology might be the reason for the upward swing. The involvement of 
horticulturists in developing onion hybrids and the role of extension functionaries in 
taking the technology to the farm level are the main reasons for the adoption of 
the technology and the resulting increase in yield. 


Based on this sequence plot, the yield for the remaining years is predicted 
and discussed in the sections that follow. 




Figure: 1.2: Graph delineating Non Stationary in Onion Yield 


S. No. | Year | Yield (kg/hectare) | Area (million hectares)
1 | 1978 | 10403 | 0.21
2 | 1979 | 10232 | 0.24
3 | 1980 | 9961 | 0.25
4 | 1981 | 10562 | 0.25
5 | 1982 | 10330 | 0.24
6 | 1983 | 9982 | 0.27
7 | 1984 | 11139 | 0.28
8 | 1985 | 10202 | 0.28
9 | 1986 | 9659 | 0.26
10 | 1987 | 9857 | 0.27
11 | 1988 | 10620 | 0.32
12 | 1989 | 10176 | 0.30
13 | 1990 | 10686 | 0.30
14 | 1991 | 11088 | 0.32
15 | 1992 | 10791 | 0.32
16 | 1993 | 10902 | 0.37
17 | 1994 | 10661 | 0.38
18 | 1995 | 10316 | 0.40
19 | 1996 | 10348 | 0.40
20 | 1997 | 9091 | 0.40
21 | 1998 | 11391 | 0.47
22 | 1999 | 9932 | 0.49
23 | 2000 | 10786 | 0.42
24 | 2001 | 10686 | 0.45
25 | 2002 | 9912 | 0.42
26 | 2003 | 11784 | 0.50
27 | 2004 | 11718 | 0.55
28 | 2005 | 13118 | 0.66
29 | 2006 | 12655 | 0.70
30 | 2007 | 12974 | 0.70
31 | 2008 | 16260 | 0.83
32 | 2009 | 16079 | 0.76
33 | 2010 | 14210 | 1.06
34 | 2011 | 16109 | 1.09
35 | 2012 | 15989 | 1.05
36 | 2013 | 16120 | 1.20


Table 2: Autocorrelation and Partial Autocorrelation Correlograms for 
non stationary 


Lag | Autocorrelations | Std. Error | Partial Autocorrelations | Std. Error
1 | 0.873 | 0.149 | 0.873 | 0.154
2 | 0.786 | 0.147 | 0.100 | 0.154
3 | 0.713 | 0.145 | 0.035 | 0.154
4 | 0.629 | 0.143 | -0.067 | 0.154
5 | 0.580 | 0.141 | 0.092 | 0.154
6 | 0.492 | | | 


Figure: 2.1: Significance of Correlogram Plots 


Correlogram Plots: 
Since the above correlogram indicates non-stationarity, the maximum lag order is 16. 


Figure: 2.2: The Sequence plot of Stationarity 


Table: 2A: The Results of Autocorrelation and Partial Autocorrelations 
for Stationary 


Lag | Autocorrelations | Std. Error | Partial Autocorrelations | Std. Error
1 | -0.370 | 0.151 | -0.370 | 0.156
2 | -0.083 | 0.149 | -0.254 | 0.156
3 | 0.234 | 0.147 | 0.126 | 0.156
4 | -0.132 | 0.145 | 0.000 | 0.156
5 | 0.244 | 0.143 | 0.306 | 0.156
6 | -0.194 | 0.141 | -0.054 | 0.156
7 | -0.025 | 0.139 | -0.068 | 0.156
8 | 0.140 | 0.137 | -0.052 | 0.156
9 | -0.131 | 0.135 | -0.052 | 0.156
10 | 0.211 | 0.133 | 0.196 | 0.156
11 | -0.051 | 0.130 | | 0.156
12 | -0.154 | 0.128 | -0.075 | 0.156
13 | 0.158 | 0.126 | -0.095 | 0.156
14 | -0.018 | 0.124 | -0.031 | 0.156
15 | -0.005 | 0.121 | -0.007 | 0.156
16 | -0.018 | 0.119 | 0.044 | 0.156


Correlogram Plot: 
Since the above correlogram indicates stationarity, the maximum lag order is 16. 


Forecast: 
In this section, the models defined above are used to forecast the onion yield from 
the 1978-2020 data. The forecasting accuracy is also assessed at the 
end of this section by comparing the forecasts with the actual data. 


Forecast Model Statistics: 
Based on the forecasting performance and the testing and estimation results, ARIMA (1, 
1, 1) was chosen. The residual ACF and PACF of ARIMA (1, 1, 1) provide a slightly better 
result. We can see that the ACF and PACF of ARIMA (1, 1, 1) have a spike crossing the 
boundary at lag 1, while nothing is crossing the boundary in the ACF and PACF 
of ARIMA (1, 1, 1). This means that ARIMA (1, 1, 1) is slightly more stable, which 
probably provides better forecasting results. 


Table: 3: Model statistics for Onion Yield 


Model | Number of Predictors | Model Fit statistics: Stationary R-squared | Ljung-Box Q(18) Statistics | DF | Sig. | Number of Outliers
Model_1 | 0 | 0.167 | 10.937 | 16 | 0.813 | 0


Table: 4: Forecast for Onion Yield with Upper Control and Lower 
Control Limits 


Years to which Forecast was Made | Forecast Figures of Yield / Ha | Upper Control Limits | Lower Control Limits
2020 | 19006 | 20902 | 17109
2021 | 19225 | 21402 | 17048
2022 | 19434 | 21904 | 16963
2023 | 19644 | 22371 | 16916
2024 | 19854 | 22817 | 16891
2025 | 20063 | 23244 | 16882
2026 | | | 
2027 | 20483 | 24060 | 16906
2028 | 20693 | 24453 | 16933
2029 | 20903 | 24837 | 16969
2030 | 21112 | 25213 | 17012
2031 | 21322 | 25583 | 17061
2032 | 21532 | 25947 | 17117

Figure: 4.1 Forecasted Yield of Onion and its Graphical Signature 
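
The control limits in Table 4 can be checked with simple arithmetic: each forecast lies at the midpoint of its limits, and the half-width of the interval grows with the forecast horizon. The sketch below uses three rows from Table 4. Reading the limits as a normal-based 95 per cent interval (z = 1.96) is an assumption made for illustration, and the function name `interval_summary` is our own.

```python
Z_95 = 1.96  # assumed normal quantile for a 95 per cent interval


def interval_summary(point, upper, lower, z=Z_95):
    """Return (midpoint, half_width, implied_standard_error) of a forecast interval."""
    midpoint = (upper + lower) / 2
    half_width = (upper - lower) / 2
    return midpoint, half_width, half_width / z


# (forecast, upper limit, lower limit) in kg/ha, taken from Table 4.
table4 = {
    2020: (19006, 20902, 17109),
    2021: (19225, 21402, 17048),
    2032: (21532, 25947, 17117),
}

for year, (point, upper, lower) in table4.items():
    midpoint, half_width, implied_se = interval_summary(point, upper, lower)
    print(year, point, midpoint, half_width, round(implied_se, 1))
```

The widening half-width reflects the growing uncertainty of forecasts made further into the future.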
Table: 5: Results of Goodness of Fit 


Model Fit 
Fit Statistic | Mean | Minimum | Maximum
Stationary R-squared | 0.167 | 0.167 | 0.167
R-squared | 0.902 | 0.902 | 0.902
RMSE | 937.277 | 937.277 | 937.277
MAPE | 5.665 | 5.665 | 5.665
MaxAPE | 18.973 | 18.973 | 18.973
MAE | 688.584 | 688.584 | 688.584
MaxAE | 3084.933 | 3084.933 | 3084.933
Normalized BIC | 13.958 | 13.958 | 13.958


For the error measures and the information criterion in Table 5, smaller values 
indicate a better fit. The abbreviations are: 
• MAPE = Mean Absolute Percentage Error, 

• RMSE = Root Mean Square Error, and 
• Normalized BIC = Normalized Bayesian Information Criterion. 


The above analysis leads us to identify the model as ARIMA (1,1,1), because it 
performs well on all the criteria (OLS estimates, observations 1978-2020). With 
R-squared = 90.20 per cent, the fitted values follow the observed data closely. 


Goodness of Fit: 

Table 5 outlines the goodness-of-fit statistics for the onion yield data. 
R-squared is an estimate of the proportion of the total variation in the 
series that is explained by the model. Here R-squared is 0.902, indicating 
that about 90 per cent of the variation in yield is explained by the fitted 
model. A larger value indicates a more accurate fit, and a value this high 
means that the model does an excellent job of explaining 
the observed variation in the series. The Mean Absolute Percentage Error 
(MAPE), 5.665 per cent here, is a measure of how much the dependent series varies from 
its model-predicted level. The Root Mean Square Error (RMSE), 937.277 kg per hectare 
here, is the square root of the mean squared error and measures how much the dependent series 
varies from its model-predicted level, expressed in the same units as the 
dependent series. The maximum absolute percentage error (MaxAPE), 18.973 per cent 
here, is the measure that is useful for imagining a worst-case scenario for the 
forecast model. 
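
The error measures in Table 5 can be sketched as follows. This is a minimal illustration, not the SPSS computation: the function name `fit_statistics` is our own, the mean squared error is divided by n without any degrees-of-freedom adjustment, and the actual and fitted values are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
import math


def fit_statistics(actual, predicted):
    """Compute RMSE, MAE, MaxAE, MAPE and MaxAPE for paired values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    # Percentage errors are taken relative to the actual values.
    pct_errors = [100 * abs(e) / abs(a) for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs_errors) / n,
        "MaxAE": max(abs_errors),
        "MAPE": sum(pct_errors) / n,
        "MaxAPE": max(pct_errors),
    }


# Hypothetical actual and fitted values, for illustration only.
actual = [100.0, 200.0, 400.0]
fitted = [110.0, 190.0, 380.0]

stats = fit_statistics(actual, fitted)
for name, value in stats.items():
    print(name, round(value, 3))
```

RMSE, MAE and MaxAE are in the units of the series (kg per hectare for yield), while MAPE and MaxAPE are percentages and so can be compared across series with different units.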


SALIENT FINDINGS: 
• ARIMA (1,1,1) was found to be the most applicable model, and the 
predictions for 2020-2032 were made from it. 


• ARIMA was used because of its capability to make predictions 
from time series data with various kinds of pattern and with 
autocorrelation between the successive values in the time series. 


• The model was also tested statistically, and the residuals (forecast 
errors) were validated. 


• The residuals of the fitted ARIMA model appear to be normally 
distributed with mean 0 and constant variance. Hence it can be concluded that 
the selected ARIMA (1,1,1) provides an adequate model for 
interpreting and forecasting onion yield in India. 


• The ARIMA (1,1,1) model indicates an increase in the onion yield 
for the years selected for the forecast study. 


SUMMARY AND CONCLUSIONS: 

In this research study, the researcher analysed onion yield in India and obtained 
forecasts using ARIMA models. The results of the study show that the 
ARIMA (1,1,1) model is the most appropriate model for forecasting onion yield 
in India. The forecast model and the forecast graph show that the onion yield is 
rapidly increasing with the passage of years. Overall, we can see that ARIMA (1,1,1) 
provides a good fit for onion yield in India and gives fairly accurate forecasts. 
However, although the forecasts for 2020-2032 are within the 95 per cent interval, 
the graph shows that the blue line of actual data has been gradually moving out of the 
confidence interval. 